Learning to Identify Ambiguous and Misleading News Headlines
Accuracy is one of the basic principles of journalism. However, it is
increasingly hard to maintain given the diversity of news media. Some editors
of online news use catchy headlines that trick readers into clicking.
These headlines are either ambiguous or misleading, degrading the reading
experience of the audience. Thus, identifying inaccurate news headlines is a
task worth studying. Previous work names these headlines "clickbait" and
mainly focuses on features extracted from the headlines alone, which limits
performance because the consistency between headlines and news bodies is
underappreciated. In this paper, we redefine the problem and identify
ambiguous and misleading headlines separately. We utilize class sequential
rules to exploit structure information when detecting ambiguous headlines. For
the identification of misleading headlines, we extract features based on the
congruence between headlines and bodies. To make use of a large unlabeled data
set, we apply a co-training method and obtain a performance gain. The
experimental results show the effectiveness of our methods. We then use our
classifiers to detect inaccurate headlines crawled from different sources and
conduct a data analysis.
Comment: Accepted by IJCAI 201
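The co-training idea mentioned in the abstract can be sketched in a deliberately tiny toy form. This is our illustration, not the authors' implementation: the nearest-mean "classifier", the two numeric feature views (standing in for headline features vs. headline-body congruence features), and all names are assumptions. Two models, each trained on a different view, iteratively hand their most confident predictions on unlabeled data to each other.

```python
# Illustrative co-training loop (a sketch, not the paper's code). Each example
# is a tuple of two feature values, one per "view"; labeled examples pair the
# features with a class label.

def train(labeled, view):
    """Toy classifier: remember the mean feature value per class for one view."""
    sums, counts = {}, {}
    for features, label in labeled:
        sums[label] = sums.get(label, 0.0) + features[view]
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, features, view):
    """Pick the class with the closest mean; confidence is the margin."""
    dists = sorted((abs(features[view] - mean), label)
                   for label, mean in model.items())
    confidence = dists[1][0] - dists[0][0] if len(dists) > 1 else 1.0
    return dists[0][1], confidence

def co_train(labeled, unlabeled, rounds=3, per_round=1):
    """Grow the labeled set: each view labels examples for the other."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        for view in (0, 1):  # e.g. headline view vs. body-congruence view
            model = train(labeled, view)
            # Move the most confidently predicted examples into the labeled set.
            scored = sorted(pool, key=lambda f: -predict(model, f, view)[1])
            for features in scored[:per_round]:
                label, _ = predict(model, features, view)
                labeled.append((features, label))
                pool.remove(features)
    return labeled
```

In a real setting the two views would come from independent feature extractors and the base learners would be proper classifiers; the structure of the loop is the point here.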
Visual Information Guided Zero-Shot Paraphrase Generation
Zero-shot paraphrase generation has drawn much attention because large-scale,
high-quality paraphrase corpora are scarce. Back-translation, also known as the
pivot-based method, is a typical approach to this end. Several works leverage
different kinds of information as the "pivot", such as another language or a
semantic representation.
In this paper, we explore using visual information, namely images, as the
"pivot" of back-translation. Unlike the pipeline-style back-translation method,
our proposed visual information guided zero-shot paraphrase generation (ViPG)
relies only on paired image-caption data. It jointly trains an image captioning
model and a paraphrasing model, and leverages the image captioning model to
guide the training of the paraphrasing model. Both automatic and human
evaluation show that our model generates paraphrases with good relevancy,
fluency, and diversity, and that images are a promising kind of pivot for
zero-shot paraphrase generation.
Comment: Accepted by COLING 202
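The pivot idea behind back-translation can be illustrated with a minimal sketch (ours, not ViPG; the lookup tables below stand in for trained captioning and generation models): two different surface sentences map to the same pivot representation, so decoding back from the pivot yields a paraphrase.

```python
# Toy illustration of pivot-based paraphrasing (an assumption-laden sketch,
# not the paper's model). The dictionaries play the role of a learned
# "captioning-like" encoder and a learned generator.

TO_PIVOT = {  # sentence -> shared pivot representation (e.g. image concepts)
    "a dog runs in the park": ("dog", "run", "park"),
    "the canine sprints through the park": ("dog", "run", "park"),
}
FROM_PIVOT = {  # pivot representation -> generated sentence
    ("dog", "run", "park"): "the canine sprints through the park",
}

def paraphrase_via_pivot(sentence):
    """Encode to the pivot, then decode back to a (possibly different) surface form."""
    pivot = TO_PIVOT[sentence]
    return FROM_PIVOT[pivot]
```

The point of the sketch is the information flow: because both sentences collapse onto one pivot, the decoder's output is meaning-preserving but need not reuse the input's wording.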
A Comprehensive Evaluation of Constrained Text Generation for Large Language Models
Advancements in natural language generation (NLG) and large language models
(LLMs) have led to proficient text generation in various tasks. However, owing
to the opacity of LLMs, integrating intricate constraints into neural text
generation remains challenging. This study investigates constrained text
generation for LLMs, where predefined constraints are applied during the
model's generation process. Our research examines multiple LLMs, including
ChatGPT and
GPT-4, categorizing constraints into lexical, structural, and relation-based
types. We also present various benchmarks to facilitate fair evaluation. The
study addresses some key research questions, including the extent of LLMs'
compliance with constraints. The results illuminate LLMs' capacity for, and
deficiencies in, incorporating constraints, and provide insights for future
developments in constrained text generation. Code and datasets will be
released upon acceptance.
Comment: Work in progress
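As an illustration of the lexical-constraint category, a minimal checker of the kind such evaluations rely on might look like the following. This is a sketch under our own assumptions, not the paper's released code: it simply measures what fraction of required keywords appear as whole words in the generated text.

```python
# Hypothetical lexical-constraint checker (our illustration): score how well a
# generated text satisfies a set of required keywords.

import re

def satisfies_lexical_constraints(text, keywords):
    """Return the fraction of required keywords present as whole words in text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    hits = sum(1 for kw in keywords if kw.lower() in words)
    return hits / len(keywords) if keywords else 1.0
```

Structural constraints (e.g. a required word count) and relation-based constraints would need their own checkers, but each reduces to the same pattern: a deterministic predicate applied to the model's output.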